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A FAST  APPROXIMATION  TO  THE  COMPLEMENTARY  ERROR 
FUNCTION  FOR  USE  IN  FITTING  GAMMA-RAY  PEAKS 


I.  Introduction 

The  error  function  and  Its  complement  are  widely  used  In  fields 
such  as  computational  physics,  numerical  methods  and  statistical  analy* 
sis,  and  specifically  In  fitting  gamma-ray  peaks  for  quantitative 
analysis  of  spectra  from  germanium  detectors.  All  of  these  applications 
make  heavy  use  of  computer  calculations.  Unfortunately  these  functions 
tend  to  be  relatively  slow  to  calculate  precisely  on  a digital  computer. 
In  program  H7PEKMET  (1),  which  was  developed  at  the  Naval  Research  Labo- 
ratory (NRL)  for  automatic  peak  analysis  of  gamma-ray  spectra,  the 
complementary  error  function  appears  In  several  terms  of  the  peak-shape 
function  which  Is  used  In  an  Iterative  least-squares  fit  to  the  peaks. 
Timing  studies  showed  that  the  program  was  spending  up  to  80%  of  Its 
central  processing  unit  (cpu)  time  In  the  error-function  routine.  Since 
high  precision  was  not  required  In  this  application,  a search  was  made 
for  a fast  approximation  to  this  function. 

II.  Computation 

The  error  function  er/(x)  derives  Its  name  as  the  Integral  of  the 
normal  curve  of  error, 

er/(x)  ■ (2//ir)yjj  e”^  dt. 

It  Is  so  normalized  to  have  a range  of  -1  to  -fl  for  -*•  1 x ^ The 
complementary  error  function  Is  defined  by 


•rfcix) 


- (2//ir) 


e"‘^  dt 


■ 1 - sr/(x) , 


Note:  MamiMiipt  Mibnitted  Fobteary  18, 1979. 
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It  thus  has  a range  of  2 to  0 for  ^ x £ «.  On  the  Texas  Instruments 
ASC  Model  7 computer  at  NRL,  the  double  precision  (8-blt  exponent,  56-blt 
mantissa)  complementary  error  function  Is  calculated  (2)  as  follows: 

1.  For  (0  £ X a 1)  an  11th  order  polynomial  approximation  Pi(x) 

Is  used. 

2.  Within  the  range  (1  a x a 2.04)  various  14th  order  pol3momlal8 
P2(x)  are  used. 

3.  Within  the  range  (2.04  < x a 13.306)  various  functions  of  the 
form  e~^  P3(l/x^)  are  used,  where  P3  Is  a 14th  order 
polynomial . 

4.  For  (13.306  < x)  the  function  underflows  and  Is  set  to  0. 

5.  For  (x  < 0)  the  following  relation  Is  used. 

•r/c(x)  ■ 2 - erfc(-x) 

Most  non-llnear  least-squares  optimization  routines.  Including  that  used 
In  HTPERMET,  require  precise  derivatives.  For  the  complementary  error 
function  we  have 

^ (•r/c(x))  - - (2//it)  e"*^ 

III.  Application 

In  the  peak  shape  function  used  by  HTPERMET,  the  main  term  Is  a 
gausslan  of  width  5 and  amplitude  F 

fl(x)  • r •»p(-x2/d*). 

Added  to  this  Is  an  exponential  "skaw''  term  of  amplitude  a and  slope  6 
on  the  negative  x side  of  the  peak.  When  folded  with  a gausslan, 
representative  of  random  electronic  noise,  this  term  can  be  written  as 

f2(x)  ■ 0 exp(x/6)  cer/(x  - x^) 


I 
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where  the  function 


cerfix  - X ) • (1/2)  erfc(.x/6  + 6/26) 
o 


has  a range  of  1 to  0 for  (-«  1 x £ «<>)  and  serves  to  cut  off  the  exponen- 
tial part  of  f2  smoothly  for  large  x.  The  second  term  In  the  argument  of 
erfc  Introduces  an  offset  by  x^  ■ - 6^/26.  The  derivative  can  be  written 


derfix  - x^)  ■ ^ |^cer/(x  - x^) 


^cer/(x  - x^)J 


• - (l/6/ir)  exp^-(x/6  + 6/2B)^J. 


The  amplitude  a of  fa  is  usually  an  order  of  magnitude  less  than  the 
amplitude  T of  fi,  and  the  slope  6 Is  usually  less  than  or  of  the  order 
of  the  gausslan  width  6.  When  counting  statistics  are  very  good.  It  Is 
sometimes  necessary.  In  order  to  obtain  a good  fit,  to  add  a "tail"  term 
fa,  functionally  Identical  to  fa  but  with  a slope  an  order  of  magnitude 
larger,  and  a constant  "step"  term  fi^.  Both  fa  and  ft^  are  an  order  of 
magnitude  less  than  fa  In  amplitude,  and  both  are  also  cut  off  for  large 
X by  one-half  the  complementary  error  function.  Because  the  offset  term 
In  the  argument  of  erfc  differs  for  fi  and  fa  and  Is  zero  for  fa,  the  i 

calculation  of  the  function  cerf  and  Its  derivative  derf  Is  required  up 
to  three  times  at  each  data  channel  (corresponding  to  x)  of  the  region 
, being  fit  and  for  each  Iteration  of  the  fit.  j 

I ^ 

IV.  Approximation  | 

Since  the  amplitudes  of  terms  fa»  fa  and  In  which  the  complemen- 
I tary  error  function  appears  are  small  relative  to  the  main  gausslan  term 

I 

fl,  since  the  cutoff  term  cerf  Is  very  different  from  either  1 or  0 only  ! 

for  small  absolute  x (where  fi  is  large),  and  since  the  reasons  given  i 

for  Inclusion  of  this  term  in  the  peak-shape  function  are  largely  | 

empirical,  a simpler  approximation  to  cerf  should  perform  Just  as  well  j 

In  the  fit.  (However,  whichever  approximation  is  used.  Its  value  and  the  | 

value  of  Its  derivative  must  be  calculated  precisely  at  each  channel  and  { 
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for  each  Iteration  in  order  to  avoid  cumulative  errors  during  the  fitting 
process.)  Several  approximations  to  the  error  function  are  given  in 
Ref.  (3).  The  following  was  chosen  as  sufficiently  accurate  and  conve- 
nient for  this  purpose:  let 

cutf(x)  ■ (bjt  + b^t^  + b3t^)  exp(-x^) 

for  X > 0 and  t - 1/(1  + px).  The  constant  p • 0.470  47  and  the 
polynomial  coefficients  are 

bi  - 0.174  012  1 
b2  - 0.047  939  9 
b3  - 0.373  927  8. 


For  X less  than  0, 


If  let 


cutf(x)  ■ 1 - cutf(-x). 


cerf(x)  ■ cut/(x)  + e(x). 


the  magnitude  of  the  error  term  is  stated  (3)  as 


|e(x)|  S 1.25  X 10*5. 


The  exact  derivative  is  given  by 


dutfCx)  • (cut/(x)) 


- (2x)  cut/(x) 


- pt^(bi  + 2tb2  + 3t^b3)  exp(-x^). 
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V.  Tests 

The  functions  cutf  and  dutf  were  tested  on  the  ASC  7 at  NRL  for 
speed  and  accuracy  against  the  functions  cerf  and  derf.  The  ASC  Is  a 
two-plpe-line  vector-oriented  machine  which  in  an  overlapped  fasion  can 
retrieve  arrays  of  data,  perform  simple  algebraic  manipulations  on  the 
arrays,  and  store  the  results.  For  arrays  of  length  i.  10,  this  is 
usually  much  more  efficient  than  the  corresponding  scalar  operations  on 
the  data  element  by  element.  Many  of  the  standard  mathematical  functions 
have  both  vector  and  scalar  routines  available  for  their  calculation, 
and  the  FORTRAN  compiler  Is  written  so  as  to  automatically  Invoke  the 
vector  routines  when  applicable  for  processing  expressions  Involving 
arrays  In  the  source  code.  Unfortunately,  the  complementary  error  func- 
tion Is  available  only  as  a scalar  routine. 

For  this  test  two  routines  (4)  were  written  in  FORTRAN  to  calculate 
cutf  and  dutf  for  x varying  between  -5.0  and  44.9  In  100  steps  of  0.1 
each,  which  covers  a range  typical  for  program  HYPERMET.  One  routine 
was  written  to  store  Intermediate  results  In  arrays  In  order  to  compile 
efficiently  In  vector  code.  The  second  routine  was  written  to  compile 

efficiently  in  scalar  code.  A third  program  was  written  to  calculate 

cerf  and  derf  for  the  same  steps  over  the  same  range.  Then  the  lower 
and  upper  limits  of  the  range  of  calculation  were  each  Incremented  100 
times  In  steps  of  0.01  and  the  calculations  repeated,  for  a total  of 
10,000  calculations  of  each  function.  Finally,  the  above  calculations 
were  repeated  ten  times  for  a grand  total  of  100,000  calculations  of 
each  function.  The  source  code  was  compiled  and  executed  In  double 
precision  at  three  different  optional  levels  of  compiler  optimization  (5) , 

1.  1-lcvel,  scalar  code  with  direct  translation  of  source 
code,  no  optimization 

2.  J-level,  scalar  code  with  local  and  global  optimization 

3.  K-level,  vector  code  with  local  and  global  optimisation. 

VI.  Accuracy 

In  Figs.  1 and  2 arc  plotted  the  results  of  the  calculation  for 
cutf  and  dutf  and  the  errors 
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apsc(x)  ■ cerf(x)  - cutf(x) 

0p5d(x)  ■ derf(x)  - dutf(x) 
between  x ■ 0.0  and  5.0. 

For  negative  x,  cutf(x)  ■ 1.0  - cut£(-x),  while  the  other  curves 
are  synnetrlc  about  x"0.  Over  the  range  of  0 ^ x i,  1.65,  the  error 
curve  apse  oscillates  between  ± 1.10  x 10”®.  The  sign  of  the  error  is 
Indicated  in  parentheses  and  changes  at  each  cusp  In  the  curve  as  the 
error  passes  through  zero.  For  x > 1.65,  epsc  declines  gradually  with 
cut/.  At  x«5,  cut/  • 7.99  x 10”^®  and  lepsc|  ■ 3.08  x 10”^**  or  about 
Similar  behavior  Is  observed  for  the  error  epsd  In  the  derivative  dut/. 

VII.  Timing 

Table  1 gives  the  epu  execution  times  for  100,000  fast-approxlmatlon 
calculations  of  cut/  and  dut/  in  both  the  scalar-  and  vector-coded 
versions  compared  to  the  times  for  the  full  calculation  of  cerf  and  derf, 
all  codes  compiled  and  executed  In  double  precision  on  the  Texas  Instru- 
ments ASC  7 at  NRL.  Results  are  given  for  three  different  compiler  levels 
I,  J,  and  K as  described  previously.  Table  1 shows  significant  savings 
at  all  compiler  levels,  with  a maximum  savings  at  the  K compiler  level 
for  the  vector-coded  version  of  the  fast  approximation,  which  executes 
In  23Z  of  the  time  required  for  the  full  calculation.  At  the  I and  J 
levels,  maximum  savings  are  found  for  the  scalar-coded  version,  which 
executes  In  about  60%  of  the  time  required  for  the  full  calculation. 

In  Table  2 the  lines  of  code  needed  to  calculate  the  derivatives 
were  deleted  before  compilation.  Here  the  savings  are  similar  st  the  R 
level  but  smaller  at  the  I and  J levels.  In  fact,  the  vector-coded 
version  at  the  I level  requires  about  13X  longer  time  to  execute  than 
does  the  full  calculation.  This  Is  somewhat  surprising  and  apparently 
represents  the  economy  of  an  optimized  assembly  language  code  versus  an 
optimized  FORTRAN-complled  code.  For  the  scalar  code,  this  seems  to 
Indicate  that  large  savings  can  be  realized  by  coding  frequently  used 
functions  In  assembly  language  rather  than  FORTRAN. 
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VIII.  Results 

In  program  HYPEFMET,  subroutine  FUNC  which  calculates  the  peak- 
shape  function  has  been  rewritten  In  order  to  produce  efficient  vector 
code  at  the  K-level  of  compiler  optimization.  This  change  resulted  In  a 
reduction  In  execution  time  to  about  40%  of  that  required  for  the  previous 
version.  The  vector-coded  fast  approximation  for  the  functions  cutf  and 
dutf  was  then  added  In-line  at  each  of  three  locations,  at  the  code  for 
calculating  f^,  f2  and  f3  where  the  functions  cerf  and  derf  had  been  used 
previously.  With  this  change  another  factor  of  two  In  execution  speed 
was  achieved.  As  the  final  result  of  these  changes,  the  program  HYFERMET 
now  executes  on  the  ASC  7 In  about  one-fifth  the  time  required  by  the 
previous  version  to  analyze  a typical  gamma-ray  spectrum. 
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Table  1 


Complementary  Error  Function  and  Derivatives, 
Time  (sec.)  for  100,000  Calculations 


Fast  Approximation 

Full  Calculation 

Compiler  Level 

Vector  Code 

Scalar  Code 

Scalar  Code 

I 

7.23 

5.63 

8.82 

J 

5.50 

5.04 

8.49 

K 

1.35 

2.86 

5.95 

Table  2. 

Complementary  Error  Function  Only 
Time  (sec.)  for  100,000  Calculations 


Fast  Approximation 

Full  Calculation 

Compiler  Level 

Vector  Code 

Scalar  Code 

Scalar  Code 

I 

5.94 

4.85 

5.23 

J 

4.66 

4.41 

5.10 

K 

1.19 

1.40 

5.02 
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Fig.  1 • Plot  of  ch«  fast  approximation  CUTF  to  tha  complamentary 
arror  function,  and  tha  arror  EFSC.  aa  dafinad  in  tha  taxt.  Plot- 
tad  ia  tha  loSig  of  tha  abaoluta  valua  of  tha  functiona.  Iha  aign 
ia  indicatad  in^paranthaaaa  abova  tha  eurva.  Tha  euapa  in  tha  curva 
for  EPSC  raault  from  aign  ehangaa  aa  tha  function  paaaaa  through 
zaro. 


9 


Fig.  2 • Plot  of  th«  dorlvatlv*  DUTF  of  th«  faat  approxlaatlon 
to  tha  eomplanantary  arror  function  CUTF,  and  tha  arror  EPSD 
aa  daflnad  In  tha  taxt.  Tha  algn  la  Indlcatad  In  paranthaaaa 
abova  tha  eurva.  Tha  cuapa  In  tha  eurva  for  EPSD  raault  fron 
algn  ehangaa  aa  tha  function  paaaaa  through  aaro. 
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APPENDIX 


PROGRAM  CUTFV 

PROGRAM  FOR  TIMING  TESTS  OF  A RATIONAL  APPROXIMATION  TO  THE 
COMPLEMENTARY  ERROR  FUNCTION  CUTF  AND  ITS  DERIVATIVE  DUTF 
VECTOR  CODED  VERSION 

IMPLICIT  REAL*8(A-H,0-Z) 

DIMENSION  GAUSN(IOO) ,T3(100) ,T1(100) ,T2(100) ,U(100) 

DIMENSION  CUTF(100),DUTF(100) 

DIMENSION  FT(100),DFT(100),DFX(100) 

DATA  A1/0.1740121D00/,A2/-0.0479399D00/ 

DATA  \3/0.3739278DOO/,AP/0.47047DOO/ 

C 

N-lOO  I 

DU-O.OIDOO  ' 

DO  1000  KX-1,10 
DO  1000  IX-1,100 

UO— 5 . 11D00+DU*DFL0AT  (IX) 

DO  100  I-1,N 

U(I)-UOfO. 1D00*DFL0AT(I) 

GAUSN(I)-DEXP(-U(I)**2) 

100  CONTINUE  j 

K-IDINT(-U0*10.)  J 

DO  200  I-1,K 
U(I)— Ud) 

200  CONTINUE 

DO  300  I-1,N 

T1 (I)-l .DOO/ (1 . D00tAP*U(I) ) 

T2(I)-T1(I)*T1(I) 

T3(I)-T1(I)*T2(I)  I 

FT(I)-A1*T1(I)+A2*T2(I)+A3*T3(I)  ! 

DFT(I)-Al+2 .D00*A2*T1(I)+3.D00*A3*T2 (I) 

DFX(I)— DFT(I)*AP*T2  (I) 

300  CONTINUE 

I DO  400  I-1,N 

CUTF(I)-FT(I)*GAUSN(I) 

t DUTF(I)-DFX(I)*GAUSN(I)-2.D00*U(I)*CUTF(I) 

I 400  CONTINUE 

DO  500  I-1,K 

500  CUTF(I)-1.D00-CUTF(I) 

1000  CONTINUE 

. C 

1 STOP 

END 
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PROGRAM  CUTFS 

PROGRAM  FOR  TIMING  TESTS  OF  A RATIONAL  APPROXIMATION  TO  THE 
COMPLEMENTARY  ERROR  FUNCTION  CUTF  AND  ITS  DERIVATIVE  DUTF 
SCALAR  CODED  VERSION 
C 

IMPLICIT  REAL*8(A-H,0-Z) 

DIMENSION  U(IOO) 

DIMENSION  CUTF(100),DUTF(100) 

DATA  Al/0.1740121D00/,A2/-0.0479399DO0/ 

DATA  A3/0.3739278DO0/,AP/0.47047DOO/ 

C 

N-lOO 

DU-O.OIDOO 
DO  1000  KX-1,10 
DO  1000  IX-1,100 

UO— 5 . 11DOO+DU*DFLOAT  (IX) 

DO  100  I-1,N 

U ( I ) -UOfO  ..1DOO*DFLOAT  ( 1 ) 

UA-DABS(U(I)) 

T- 1 . DOO/ ( 1 . DOO+AP*UA) 

HN-DEXP(-U(I)**2) 

CUTF(I)-HN*T*(A1+T*(A2+T*A3) ) 

DUTF ( I ) — HN*AP*T*T* ( Al+T* ( 2 . D00*A2+T*  3 . DOO*A3 ) ) 

* -2.D00*UA*CUTF(I) 

100  CONTINUE 

K-IDINT(U-U0*10. ) 

DO  200  I-1,K 

200  CUTF(I)-1.D00-CUTF(I) 

1000  CONTINUE 
C 

STOP 

END 
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PROGRAM  CERF 

PROGRAM  FOR  TIMING  TESTS  OF  THE  ASC  LIBRARY  ROUTINE  FOR  THE 
COMPLEMENTARY  ERROR  FUNCTION  CERF  AND  ITS  DERIVATIVE  DERF 

IMPLICIT  REAL*8(A-H,0-Z) 

DIMENSION  U(IOO) 

DIMENSION  CERF (100) ,DERF(100) 

C 

RPI-0.5641893SD00 

N-lOO 

DU-O.OIDOO 
DO  1000  KX-1,10 
DO  1000  lX-1,100 

UO— 5 . llDOOfDU*DFLOAT(IX) 

DO  100  I-1,N 

U(l)-UOfO. 1DOO*DFLOAT(I) 

CERF(I)-0. 5D00*DERFC (U(I) ) 

DERF  (D— RPI*DEXP  (-U  (I)  **2  ) 

100  CONTINUE 
1000  CONTINUE 
C 

STOP 

END 
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